# Full text of "A comparison of single and simultaneous equation techniques"

Historic, Archive Document. Do not assume content reflects current scientific knowledge, policies, or practices.

UNITED STATES DEPARTMENT OF AGRICULTURE - AGRICULTURAL MARKETING SERVICE

A COMPARISON OF SINGLE AND SIMULTANEOUS EQUATION TECHNIQUES*

by Richard J. Foote

In my opinion, much needless confusion exists in the minds of economists and statisticians when they think about least squares versus simultaneous equation techniques. Some analysts believe that the method of least squares now is completely outmoded; others feel that simultaneous equation methods are so complex and computationally expensive that they should be avoided whenever possible. Each of these viewpoints is wrong. Simultaneous equation techniques are a useful addition to our kit of tools for use in problems that deal with regression analysis. When they are needed, they should be used, just as a hack saw is used to cut metal, whereas various woodsaws can be used to cut wood. In systems of equations, one, several, or perhaps all frequently can be fitted by least squares. Moreover, least squares equations are useful now, just as they always have been, in showing normal or average relationships that exist between sets of variables. Many problems that relate to regression analysis, such as choice of variables, location of data, choice of functional forms, and the testing and interpretation of results, are almost identical regardless of whether the equations are to be fitted by least squares or by simultaneous equation techniques.

Because of the confusion that exists, I propose to start with some extremely elementary concepts with which I am sure you are all familiar. We shall then proceed step by step in such a way as to show precisely when and why simultaneous equation techniques are needed.
In later sections, I shall discuss some computational aspects of regression analysis and some considerations relating to the degree of complexity that may be desirable in formulating a system of economic relationships.

## Some Economic Considerations

In 1927, Elmer Working gave an excellent discussion of what now is called the identification problem in his classic paper "What Do Statistical 'Demand Curves' Show?" 1/ He pointed out that when a research worker begins a demand study, he is confronted with a set of dots like that shown in section A of the chart on p. 3. He knows that each can be thought of as the intersection of a demand and a supply curve, as in section B, but, without further information, neither curve can be determined from the data.

\* Prepared for presentation before the American Farm Economic Association, August 1, 1955. The author wishes to thank Glen Burrows of the Agricultural Marketing Service for a number of helpful suggestions on the statistical aspects of this paper.

1/ Quart. Jour. Econ. 41:212-235, 1927. A similar line of reasoning is followed by Koopmans, Tjalling C., Identification Problems in Economic Model Construction, pp. 27-35. This is given as Chap. II of Studies in Econometric Method. Cowles Commission for Research in Economics Monogr. 14, 1953.

Agriculture - Washington, July 1955

Working then noted that if the demand curve has shifted over time but the supply curve has remained relatively stable, as in section C, the dots trace out a supply curve; conversely, if the supply curve has shifted but the demand curve has remained stable, as in section D, the dots trace out a demand curve. If correlated shifts for each curve have taken place, as in section E, the dots trace out what may look like a structural demand or supply curve, but the slope will be too flat or too steep.
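Working's argument can be checked numerically. The sketch below is only an illustration under stated assumptions: the two linear curves, their slopes of -1 and +1, and the names `simulate_market` and `slope_q_on_p` are hypothetical and do not come from the paper. It generates intersection points when only the demand curve shifts, when only the supply curve shifts, and when both shift independently, and then fits a least squares line of quantity on price to each set of dots.

```python
import random

def simulate_market(n, demand_shift_sd, supply_shift_sd, seed=0):
    """Return n (price, quantity) intersections of a shifting demand and supply curve.

    Hypothetical curves, chosen only for illustration:
        demand: q = 10 - p + d      (slope -1)
        supply: q =  2 + p + s      (slope +1)
    where d and s are independent random shifts with the given standard deviations.
    """
    rng = random.Random(seed)
    points = []
    for _ in range(n):
        d = rng.gauss(0.0, demand_shift_sd)
        s = rng.gauss(0.0, supply_shift_sd)
        p = (8.0 + d - s) / 2.0   # price at which quantity demanded equals quantity supplied
        q = 10.0 - p + d          # quantity read off the demand curve at that price
        points.append((p, q))
    return points

def slope_q_on_p(points):
    """Least squares slope of quantity on price for a set of dots."""
    n = len(points)
    mean_p = sum(p for p, _ in points) / n
    mean_q = sum(q for _, q in points) / n
    s_pq = sum((p - mean_p) * (q - mean_q) for p, q in points)
    s_pp = sum((p - mean_p) ** 2 for p, _ in points)
    return s_pq / s_pp

demand_shifts_only = slope_q_on_p(simulate_market(200, 1.0, 0.0))  # as in section C
supply_shifts_only = slope_q_on_p(simulate_market(200, 0.0, 1.0))  # as in section D
both_shift = slope_q_on_p(simulate_market(200, 1.0, 1.0))          # both curves move
print(demand_shifts_only, supply_shifts_only, both_shift)
```

With only demand shifting, the fitted slope reproduces the assumed supply slope; with only supply shifting, it reproduces the assumed demand slope. When both curves shift, the fitted slope matches neither curve, which is the indeterminacy Working described.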
In many analyses of the demand for agricultural products, factors that cause the demand curve to shift over time are included as separate variables in a multiple regression equation. In effect, we are then able to derive from our estimating equation an average demand curve. This is indicated in a rough way in section F. In some analyses, we can assume that the quantity supplied is essentially unaffected by current price. When price is plotted on the vertical scale, the supply curve in such cases is a vertical line, and year-to-year shifts in the supply curve trace out a demand curve, just as they did in section D. Under these circumstances we may be able to obtain valid estimates of the elasticity of demand by use of a least squares multiple regression analysis for which price is the dependent variable and supply and some demand shifters are used as independent variables. This point was noted by Working in his 1927 paper, 2/ emphasized by Ezekiel in a paper published in 1928, 3/ and reconsidered in 1953 by Fox 4/ in the light of modern simultaneous equations theory.

For many agricultural products, this set of circumstances permits us to estimate elasticities of demand with respect to price by use of single equation methods. Two points, however, should be kept in mind: (1) Price must be used as the dependent variable in order to obtain elasticity estimates that are statistically consistent since, to use the least squares technique, the supply curve must be a vertical line; and (2) an algebraic transformation must be made after the equation has been fitted to derive the appropriate coefficient of elasticity, since the definition is in terms of the percentage change in quantity associated with a given percentage change in price. Other circumstances under which least squares equations can be used to derive coefficients of elasticity are discussed in a later section.

What happens if we have a supply curve that is not a vertical line?
If we consider any single point, as in section G, we have no way of knowing on which demand and supply curve of a whole family of curves it lies. The basic problem of indeterminateness is similar to that in which correlated shifts in the demand and supply curves take place. What is needed is some hypothesis, adequately tested and proven to be sound, as to the nature of the joint relationships between supply and demand. We should then be able to untangle the two and to obtain a reliable estimate of the slope of each curve. This is essentially what is involved in the simultaneous equations approach. This concept was set forth by Haavelmo in 1943. 5/ Staff members of the Cowles Commission spent a considerable part of the next 10 years in showing how to implement it when working with actual data.

2/ Op. cit., p. 223.

3/ Ezekiel, Mordecai. Statistical Analyses and the "Laws" of Price. Quart. Jour. Econ. 42:199-225.

4/ Fox, Karl A. The Analysis of Demand for Farm Products. U. S. Dept. Agr. Tech. Bul. 1081.

5/ Haavelmo, Trygve. The Statistical Implications of a System of Simultaneous Equations. Econometrica 11:1-12, 1943. A more complete discussion by the same author is given in The Probability Approach to Econometrics. Econometrica, Vol. 12, Supplement, 1944.

[Chart: Supply-Demand Relationships, sections A through H. Section H: frequency distributions of the estimates obtained by the limited information method and by least squares, with the true value marked. U. S. Department of Agriculture, Neg. 1695-55 (6), Agricultural Marketing Service.]

Suppose, however, that the analyst has no interest in the true demand and supply curves but only wants a method that will assist him in studying probable future trends in prices. Working had some suggestions on this point, too.
He said, "It does not follow from the foregoing analysis that, when conditions are such that shifts of the supply and demand curves are correlated, an attempt to construct a demand curve will give a result that will be useless. Even though shifts of the supply and demand curves are correlated, a curve which is fitted to the points of intersection will be useful for purposes of price forecasting, provided no new factors are introduced which did not affect the price during the period of study. Thus, so long as the shifts of the supply and demand curves remain correlated in the same way, and so long as they shift through approximately the same range, the curve of regression of price upon quantity can be used as a means of estimating price from quantity." 6/

The problem here is that the shifts almost never "remain correlated in the same way" over a sufficiently long period to generate enough data to fit our equation. In some circumstances, changes in structure are so frequent that multiple regression equations almost always yield low correlations and frequently even "wrong" signs on the coefficients. This is particularly apparent when we attempt, by the single equation approach, to study factors that affect volume of exports. In other cases, changes in structure are of minor importance and least squares equations may yield completely satisfactory results in terms of showing relationships that have prevailed between simultaneously determined economic variables over a considerable period of time. This is frequently true when we study relationships between prices at specified locations or at local market, wholesale, or retail levels. Even here, however, the analyst should examine his results closely, perhaps by plotting the data in scatter diagrams, to determine whether changes in structure have affected the relationship. The in-between case is the one that can be dangerous.
Here the coefficients may suggest that the analysis is satisfactory; it may, in fact, be of little value for the study of future trends.

Marschak 7/ gives an interesting example of the importance of changes in structure on the need for using a complete system of equations. He considers the old problem of taxation of a monopoly. He points out, "Knowledge is useful if it helps to make the best decisions". He considers, among other things, the kinds of knowledge that are useful to guide the firm in its choice of the most profitable output level. If the tax rate has not changed in the past and is not expected to change, the firm can fit an empirical curve to observed data on output and profits and immediately derive the point of maximum revenue. If the tax rate has not changed in the past but is expected to change in the future, the firm could, if it so desired, vary its output and profits under the new tax structure and derive a new empirical relation. But this takes time, and substantial losses might occur during the experimental period. If the firm had taken the trouble to derive the structural demand and net revenue curves, it could immediately have determined its most profitable output under the new tax structure. If the tax rate had varied during the initial period, an empirical regression of net revenue on output and the tax rate could have been fitted and used to find the most profitable output under the new tax structure.

6/ Op. cit., p. 227.

7/ Marschak, Jacob. Economic Measurements for Policy and Prediction, Chap. I in Studies in Econometric Method. Cowles Commission for Research in Economics Monogr. 14, 1953.

In many real life situations, changes in structure are frequent. Hence Marschak concludes: "A theory may appear unnecessary for policy decisions (or forecasting) until a certain structural change is expected or intended. It becomes necessary then.
Since it is difficult to specify in advance what structural changes may be visualized later, it is almost certain that a broad analysis of economic structure, later to be filled out in detail according to needs, is not a wasted effort".

This argument in no way invalidates the use of a single equation to estimate elasticities of demand in those cases in which the supply can be considered as unaffected by current price. In such cases, we may obtain estimates of the structural parameters that can be used in the same way as any other statistically valid estimates. Instead, Marschak is arguing that only rarely should the economic analyst be satisfied with a purely empirical fit if he can obtain structural relationships with some additional work.

## Some Statistical Considerations

We now turn to some statistical considerations that have a bearing on the extent to which we can use least squares to estimate the coefficients in a given equation. In the relationships discussed in the preceding section, we have assumed, more or less implicitly, that the points lie exactly on the demand or supply curve. In actual statistical analyses, this is never true, since some variables that cause the curves to shift always are omitted and the precise shape of the curves to be fitted is not known. Thus we assume that we are dealing with stochastic rather than functional relations. A stochastic relation basically is one that includes a set of unexplained residuals or error terms whose direction and magnitude are usually not known exactly for any particular set of calculations, but whose behavior on the average over repeated samples can be described or assumed.

In order to have a concrete example about which to talk, let us consider the following equation:

$$Y = a + b_1 Z_1 + b_2 Z_2 + u \tag{1}$$

Here $Y$ is the variable for which an estimate is desired, the $Z$'s are 2 variables which are known to affect $Y$, and $u$ is an error term.
We assume that for a number of periods we know the value of $Y$ and the $Z$'s and we wish to estimate $a$, $b_1$, and $b_2$. We do not know the value of $u$ but can estimate it in a rough way for any given period as the difference between the value of $Y$ computed from the equation and its actual value. We know that estimates of the regression coefficients will differ for different sets of observations. However, we would like to estimate them in such a way that the average value for a large number of periods or samples equals the value that would be obtained from a similar calculation based on the combined evidence of all possible samples. Estimates of this sort are known statistically as unbiased estimates. We also would like the variation of the estimates about their average or true value to be as small as possible, since under this circumstance we would have more confidence in any single estimate than if we had a large amount of variation. Estimating procedures that give the smallest possible variance are known as best estimates. Despite their name, such estimates possess no more desirable properties than many alternative estimates. So the choice of terminology is unfortunate, but it has become firmly established.

In certain circumstances, we may be unable to obtain best unbiased estimates but may be able to obtain estimates that are consistent and efficient. A consistent estimate is one that is unbiased when we work with all the possible data; it may or may not be biased in small samples. In actual practice, of course, we never have all possible data, but estimation procedures that give consistent estimates presumably are better even with small samples than are those that are known to be biased even with an infinitely large sample. Efficient estimates are similar to "best" estimates, except that they are known to give the smallest possible variance only when we work with all possible data.

If we use the method of least squares to estimate the
coefficients in equation (1), we obtain best unbiased estimates if the $u$'s and $Z$'s meet certain rather rigid specifications. Some of the specifications that relate to the $u$'s are difficult to state precisely in nonmathematical terms, but essentially they require that the $u$'s follow some (not necessarily a normal) probability distribution, that their average or expected value be zero, that their variance be finite and independent of the $Z$'s (the latter is the property of homoscedasticity, for those of you who remember your elementary statistics), and, finally, that they be serially independent. When working with economic data, we usually assume that these assumptions hold; but we may test at least the one regarding serial independence of the residuals after we have run the analysis. In some cases, we transform the data to (1) logarithms, which frequently helps to render the variance of $u$ less dependent upon the $Z$'s, or (2) first differences, which when working with economic data frequently tends to reduce the serial correlation in the $u$'s. As an alternative we may, of course, use first differences of logarithms.

In addition there is a specification regarding the $Z$'s that is easily stated but frequently disregarded by economists. To be certain that the least squares approach will give best unbiased estimates, each $Z$ must be a set of known numbers, in contrast to a random variable. When attempting to obtain elasticities of demand, this is true only in rare instances. The only case of which I can think is the one for which prices are arbitrarily set at certain levels, as in a retail store experiment, and the quantities bought by consumers at these prices are recorded. I know of only one experiment that has been conducted in this way: that for oranges by Godwin. 8/

8/ Godwin, Marshall R. Customer Response to Varying Prices for Florida Oranges. Fla. Agr. Expt. Sta. Bul. 508. 1952.

The least squares method was developed for use in connection with experiments
in the physical sciences, where the independent variables frequently are sets of known numbers, or to study relationships between variables, such as heights of fathers and heights of sons, where no "structural" coefficients are involved.

Econometricians have shown that the least squares approach will give estimates of the regression coefficients that are statistically consistent and efficient, provided the $u$'s meet approximately the same requirements as for the previous case, if the $Z$'s are predetermined variables. A simplified proof of this is given by Klein, 9/ although a fairly advanced knowledge of calculus and of the principles of probability given in chapter 2 of his book are required to follow his development. Just as the term "best" as used by statisticians is an unfortunate one, the term "predetermined" as used by econometricians is unfortunate. But as it now is commonly used in the literature, it seems advisable to learn what it means so that we can continue to use it.

Predetermined variables include those that the analyst takes as given, while endogenous variables are those that are determined simultaneously by the same set of economic forces and, therefore, are to be estimated from the model. The predetermined variables commonly are divided into exogenous variables and lagged values of endogenous variables. Lagged values of the endogenous variables, naturally, are values of these variables for a previous time period. Exogenous variables include all other variables that might enter into an analysis. Weather is a commonly cited exogenous variable. For many variables, the exact economic structure of the segment of the economy being studied must be carefully considered to determine whether they should be classified as endogenous or exogenous, and frequently different analysts will classify them in different ways.

We now can reconsider an example cited earlier.
In section F we showed a diagrammatic representation of a situation for which the least squares method could be used to estimate the slope of the demand curve. We now know that this estimate will be statistically consistent and efficient only if the quantity consumed and the demand shifters each can be classified as an exogenous or lagged endogenous variable. Fox 10/ has argued that this is approximately true for a considerable number of agricultural products, including meat, poultry and eggs, feed grains, and a number of fresh fruits and vegetables. Under the assumptions of Fox, market price is used as the dependent variable and the independent variables are supply (which is assumed to be highly correlated with consumption) and some relevant demand shifters, including usually disposable income.

Another situation under which we can use the method of least squares to estimate elasticities of demand is that for which data are available on purchases or consumption of individual consumers, as prices that confront consumers are determined chiefly by factors other than those that affect their purchases. In this case, consumption is taken as the dependent variable and retail prices of the various items, family income, and perhaps other household characteristics, are taken as independent variables.

9/ Klein, Lawrence R. A Textbook of Econometrics. Row, Peterson and Co., 1953, pp. 80-85.

10/ Op. cit.

One further problem needs to be considered before leaving the statistical aspects of this subject. In all cases given previously, we have assumed implicitly that the variables are known without error. Any analyst who has been connected with the compilation of data from original sources knows that errors of one sort or another always creep in. These can result from memory bias on the part of respondents, inability to find all of the people in a complete census, errors of sampling, and a host of other reasons beyond the control of the most careful investigator.
Whenever we work with economic series, nonnegligible errors in the data are known to exist. Fox has suggested an easy way to correct for this, provided we are willing to make an informed guess about the average magnitude of the errors. 11/ Such guesses or estimates can be made by carefully reading how the series was compiled, or preferably, by talking to the person in charge of the compilation. For most series, we have a fairly good idea as to whether the error is in terms of, say, a few percent, 10 to 20 percent, or something higher. If we wish, we can experiment with various assumptions about the level of error, compare the effect on our coefficients, and then arrive at a rather good idea as to the effect which likely errors may have on their magnitude.

Because of the importance of a correction of this sort, I would like to show you exactly what is involved in an extremely simple case. All of you know, I am sure, that when we express variables in terms of deviations from their respective means, the least squares regression coefficient between two variables is given by:

$$b_{yx} = \frac{\sum yx}{\sum x^2}$$

Suppose that $y$ and $x$ are each subject to error and that the magnitude of the error is given by $d$ and $e$, respectively. The sample regression coefficient then equals:

$$b'_{yx} = \frac{\sum (y+d)(x+e)}{\sum (x+e)^2}$$

Let us now expand the numerator and denominator. We obtain:

$$\sum (y+d)(x+e) = \sum yx + \sum ye + \sum dx + \sum de$$

$$\sum (x+e)^2 = \sum x^2 + 2\sum xe + \sum e^2$$

11/ This approach and some examples based on it are described in some detail by Foote, Richard J. and Fox, Karl A. Analytical Tools for Measuring Demand. U. S. Dept. Agr. Agr. Handb. 64, 1954, pp. 29-35.

If $d$ and $e$ are random and independent, and we have a large sample, any summation term that involves one or both of them as a cross-product equals approximately zero. Summation terms that involve their squares, however, do not equal zero. Applying this principle, when the variables are
subject to error, we can write the value of the regression coefficient in the following way:

$$b'_{yx} \approx \frac{\sum yx}{\sum x^2 + \sum e^2}$$

This gives a biased estimate of $b_{yx}$ because the denominator is too large. But if we reduce the sum of squares for the independent variable by an amount proportional to the assumed percentage error, then we obtain an estimate of the regression coefficient that is approximately unbiased provided the only source of bias is that due to errors in the data. Such a correction can be made easily. In working with multiple regression coefficients, some formulas involve sums of squares for the dependent variable also, so all sums of squares that enter into the analysis should be corrected in a similar way. I see no reason why the same sort of correction could not be applied to the sums of squares that are involved in equations to be fitted by the simultaneous equations approach.

## Some Econometric Considerations

Discussion in the preceding section suggests, and econometricians have shown, that if we use the least squares approach to estimate the regression coefficients in an equation that contains current values of 2 or more endogenous variables, we obtain estimates that are statistically biased. The mathematical nature of the bias has been shown by a number of authors in a supposedly popular way, 12/ but I have yet to find an explanation that is completely satisfactory for a nonmathematician. We do, however, have some experimental evidence of the kind of bias that results when the method of least squares is applied in such cases.

Methods which have been developed to handle equations that contain 2 or more endogenous variables are known to give estimates that are statistically consistent, but methods are not now available that are known to be statistically unbiased. Thus, we cannot say for sure what happens when we work with the small samples that usually are involved in economic research.
An experiment was designed to measure the kind of bias that arises when we apply these methods to small samples 13/ and, as a byproduct of this experiment, we have some concrete evidence of the kind of bias that may arise when we use the method of least squares instead.

12/ See, for example, Bronfenbrenner, Jean. Sources and Size of Least-Squares Bias in a Two-Equation Model, Chap. IX in Studies in Econometric Method. Cowles Commission for Research in Economics Monogr. 14, 1953; Bennion, E. G. The Cowles Commission's "Simultaneous-Equation Approach": A Simplified Explanation. Review Econ. and Statis. 34:49-56, 1952; and Meyer, John R., and Miller, Henry Lawrence, Jr. Some Comments on the "Simultaneous-Equations" Approach. Review Econ. and Statis. 36:88-92, 1954.

13/ Wagner, Harvey M. A Monte Carlo Study of Estimates of Simultaneous Linear Structural Equations. Tech. Rpt. 12, Dept. of Econ., Stanford Univ. 1954.

In this experiment, a simple 3-equation model was formulated with known coefficients. Variables generated by the model were obtained and random error terms added to them. Two thousand observations were obtained in this way, and they were then divided into 100 samples of 20 observations each. The first equation contains 2 endogenous variables. Regression coefficients were obtained for this equation for each sample by the limited information approach, which I will discuss later, and by the method of least squares. Since there were 100 samples, 100 separate estimates of the single regression coefficient involved were obtained by each approach. Frequency distributions of these estimates are shown in section H of the chart, together with the true value of the coefficient. Each method gives estimates that are biased, but the 3 highest frequencies for the limited information approach are grouped about the true value, whereas the 3 highest frequencies for the least squares approach each are to the right of the true value.
The average bias of the least squares estimates was almost 3 times as large as that for the limited information approach. In fairness, I should point out that for a second version of this model the 2 frequency distributions are more nearly similar, but the average bias in the least squares estimates still was almost double that for the limited information estimates. This study was carried out at Stanford University by making use of a large-scale electronic computer. It is to be hoped that further experiments of similar character, with different sorts of models, will be undertaken.

In discussing methods that are used to handle equations that involve more than 1 endogenous variable, it is convenient to introduce a mathematical concept that deals with the degree of identification. We saw earlier that it is sometimes impossible to estimate the coefficients in certain structural equations with the kind of statistical data available to do the job. Such equations are said to lack identifiability or to be underidentified. Identifiable equations, however, may be just identified or overidentified. In the discussion that follows, we deal only with identifiable equations. The degree of identification relates to individual equations in a system, not to the entire system.

By algebraic manipulation, we always can write down the equations in a complete system so that the number of equations equals the number of endogenous variables. We then can think of these as n equations in n unknown endogenous variables, and we can always solve the equations so that each endogenous variable is expressed as a function of all of the predetermined variables in the system. These are called reduced form equations. Since each reduced form equation contains only a single endogenous variable, we can obtain estimates of the regression coefficients in these equations that are statistically consistent and efficient by use of the method of least squares.
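The kind of sampling experiment described earlier in this section can be imitated on a small scale. The sketch below does not reproduce the Stanford model: it assumes a hypothetical 2-equation system, $p = b_{11} q + b_{12} y + u$ and $q = b_{21} p + b_{22} w + v$, with made-up coefficients and standard normal errors. It draws 100 samples of 20 observations each and fits the first equation by least squares, to show the direction of the bias when an equation contains 2 endogenous variables.

```python
import random

TRUE_B11 = -0.5                 # hypothetical coefficient of q in the first equation
B12, B21, B22 = 0.5, 0.5, 1.0   # other hypothetical structural coefficients

def draw_sample(rng, n):
    """Generate n observations (p, q, y, w) from the assumed 2-equation model."""
    denom = 1.0 - TRUE_B11 * B21
    rows = []
    for _ in range(n):
        y, w = rng.gauss(0, 1), rng.gauss(0, 1)   # predetermined variables
        u, v = rng.gauss(0, 1), rng.gauss(0, 1)   # error terms of the two equations
        # Solve the two structural equations for the endogenous variables.
        p = (TRUE_B11 * B22 * w + B12 * y + u + TRUE_B11 * v) / denom
        q = (B22 * w + B12 * B21 * y + B21 * u + v) / denom
        rows.append((p, q, y, w))
    return rows

def least_squares_b11(rows):
    """Coefficient of q in the least squares regression of p on q and y."""
    n = len(rows)
    mp = sum(r[0] for r in rows) / n
    mq = sum(r[1] for r in rows) / n
    my = sum(r[2] for r in rows) / n
    s_qq = sum((r[1] - mq) ** 2 for r in rows)
    s_yy = sum((r[2] - my) ** 2 for r in rows)
    s_qy = sum((r[1] - mq) * (r[2] - my) for r in rows)
    s_qp = sum((r[1] - mq) * (r[0] - mp) for r in rows)
    s_yp = sum((r[2] - my) * (r[0] - mp) for r in rows)
    # Normal equations for two regressors measured as deviations from means.
    return (s_qp * s_yy - s_yp * s_qy) / (s_qq * s_yy - s_qy ** 2)

rng = random.Random(1955)
estimates = [least_squares_b11(draw_sample(rng, 20)) for _ in range(100)]
mean_estimate = sum(estimates) / len(estimates)
share_above_true = sum(e > TRUE_B11 for e in estimates) / len(estimates)
print(round(mean_estimate, 3), share_above_true)
```

Under these assumptions the error term $u$ is positively correlated with $q$, so most of the 100 least squares estimates fall above the true value. The size of the bias depends entirely on the assumed coefficients and error variances.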
If each equation in the system is just identified, there is always a unique transformation by which we can go from the coefficients in the reduced form equations to the coefficients in the structural equations. I am sure that many of you have seen this transformation for simple systems of equations, but perhaps it is worthwhile to run through an example to make sure that its nature is clear to everyone. Suppose that we have the following structural demand and supply equations, where $p$ and $q$ have their usual meaning; $y$ is consumer income, $w$ represents important weather factors that affect supply, and each variable is expressed in terms of deviations from its respective mean:

$$p = b_{11} q + b_{12} y \tag{2}$$

$$q = b_{21} p + b_{22} w \tag{3}$$

We have 4 variables in the system and 3 variables in each equation; since 4-3 equals the number of endogenous variables in the system minus one, we can assume that each equation is just identified. Several rules of thumb like this are available to determine the degree of identification; more exact rules depend on the rank of certain matrices.

If we substitute the right-hand side of equation (3) for $q$ in equation (2) and the right-hand side of equation (2) for $p$ in equation (3) and simplify terms, we obtain the following:

$$p = \frac{b_{11} b_{22}}{1 - b_{11} b_{21}}\, w + \frac{b_{12}}{1 - b_{11} b_{21}}\, y \tag{4}$$

$$q = \frac{b_{22}}{1 - b_{11} b_{21}}\, w + \frac{b_{12} b_{21}}{1 - b_{11} b_{21}}\, y \tag{5}$$

Since the denominator of each term on the right of the equality sign is identical, we can ignore these denominators for the moment. If we divide the coefficient of $w$ in equation (4) by the coefficient of $w$ in equation (5), we obtain an estimate of $b_{11}$. If we divide the coefficient of $y$ in equation (5) by the coefficient of $y$ in equation (4), we obtain an estimate of $b_{21}$. Given an estimate of $b_{11}$ and $b_{21}$, we can estimate $b_{12}$ from the coefficient of $y$ in equation (4) and $b_{22}$ from the coefficient of $w$ in equation (5). This gives the 4 regression coefficients needed for our structural equations.
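The transformation just described can be traced with numbers. The sketch below assumes the structural equations read $p = b_{11} q + b_{12} y$ and $q = b_{21} p + b_{22} w$, which is the form consistent with equations (4) and (5). The numerical coefficients and the function names are hypothetical. In practice the reduced form coefficients would come from least squares fits rather than from known structural values.

```python
def reduced_from_structural(b11, b12, b21, b22):
    """Reduced form coefficients implied by equations (4) and (5)."""
    d = 1.0 - b11 * b21          # common denominator of every term
    return {
        "p_w": b11 * b22 / d,    # coefficient of w in equation (4)
        "p_y": b12 / d,          # coefficient of y in equation (4)
        "q_w": b22 / d,          # coefficient of w in equation (5)
        "q_y": b12 * b21 / d,    # coefficient of y in equation (5)
    }

def structural_from_reduced(r):
    """Recover b11, b12, b21, b22 from the four reduced form coefficients."""
    b11 = r["p_w"] / r["q_w"]    # ratio of the w coefficients, (4) over (5)
    b21 = r["q_y"] / r["p_y"]    # ratio of the y coefficients, (5) over (4)
    d = 1.0 - b11 * b21          # denominator, now that b11 and b21 are known
    b12 = r["p_y"] * d           # from the coefficient of y in equation (4)
    b22 = r["q_w"] * d           # from the coefficient of w in equation (5)
    return b11, b12, b21, b22

# Hypothetical structural values, used only to check the round trip.
reduced = reduced_from_structural(b11=-0.5, b12=0.5, b21=0.5, b22=1.0)
recovered = structural_from_reduced(reduced)
print(reduced)
print(recovered)
```

The four reduced form coefficients determine the four structural coefficients exactly, with nothing left over. That is what it means for each equation to be just identified.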
Estimates that are uniquely equivalent are obtained by any alternative algebraic manipulation. Since the $b$'s are known to be statistically consistent estimates, the estimates of the structural coefficients obtained in this way are statistically consistent. Computationally, we may wish to estimate the coefficients in another way, but the answers obtained are identical to those that would have been gotten by an algebraic manipulation of the regression coefficients from the reduced form equations.

Let us now consider an equation that is overidentified. Suppose that our supply equation contains a second predetermined variable, $z$, that represents lagged values of prices. We now have 5 variables in the system. The supply equation still is just identified, since 5-4 equals the number of endogenous variables in the system minus one. However, the demand equation is overidentified, since 5-3 is greater than 2-1.

The new reduced form equations can be obtained by the same general approach as used previously, but the result now looks like this:

$$p = \frac{b_{11} b_{22}}{1 - b_{11} b_{21}}\, w + \frac{b_{11} b_{23}}{1 - b_{11} b_{21}}\, z + \frac{b_{12}}{1 - b_{11} b_{21}}\, y \tag{6}$$

$$q = \frac{b_{22}}{1 - b_{11} b_{21}}\, w + \frac{b_{23}}{1 - b_{11} b_{21}}\, z + \frac{b_{12} b_{21}}{1 - b_{11} b_{21}}\, y \tag{7}$$

With this set of equations, $b_{11}$ could be estimated either by dividing the coefficient of $w$ in equation (6) by the coefficient of $w$ in equation (7) or by dividing the coefficient of $z$ in equation (6) by the coefficient of $z$ in equation (7). Different answers are obtained from the 2 estimates. It is in this way that overidentified equations differ from just identified ones; for overidentified equations, we have an oversufficiency of information and no direct way to decide which answer to use. In fact, neither answer obtained by the use of reduced form equations is statistically consistent.

It would be possible to solve the 2 structural equations directly for the several coefficients involved by making use of a maximum likelihood approach.
Maximum likelihood estimates are known to be statistically consistent and efficient. They are used widely in statistical work because the necessary equations always can be derived by performing certain mathematical operations that involve the maximization of the so-called likelihood function. The general approach is the same as for any maximization process by use of calculus and it is not difficult. For complex systems of equations, however, the mathematics involved in solving the resulting equations is generally complex. That part of Klein referred to in footnote 9 involved the derivation of maximum likelihood estimates. Methods for obtaining maximum likelihood estimates based on a simultaneous solution for all of the structural equations are discussed by Klein and in Cowles Commission Monograph 14 and are called full-information maximum likelihood estimates but, to quote Klein, the computations involved in general are "formidable." Hence this method is seldom used.

Another method, developed by staff members of the Cowles Commission, is called the single-equation limited-information maximum-likelihood method. In this approach, equations are fitted one at a time and information regarding variables that appear in each of the other equations of the system is disregarded. Use is made, however, of all endogenous and predetermined variables that appear in the equation and of all other predetermined variables that appear in the system. This method is known to give estimates of the regression coefficients that are statistically consistent and as efficient as any other method that utilizes the same amount of information. A compromise between this and the full-information method is one called the limited-information subsystem method, in which selected groups of equations are fitted as a unit.
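For readers who want to see the single-equation limited-information method in action, the sketch below fits an overidentified demand equation p = b11 q + b12 y + u, using y as the included predetermined variable and w and z as the other predetermined variables in the system. It uses the least-variance-ratio (k-class) formulation found in later textbooks, which is not necessarily the computing layout used at the time; the function names, coefficient values, sample size, and seed are all assumptions made for illustration.

```python
import numpy as np

def residuals(A, X):
    """Residuals of A (a vector or columns) after least squares on X."""
    coef, *_ = np.linalg.lstsq(X, A, rcond=None)
    return A - X @ coef

def liml(lhs, rhs_endog, incl_predet, all_predet):
    """Limited-information ML for one equation, via the k-class form.

    Returns (kappa, coefs), coefs ordered [rhs_endog..., incl_predet...].
    """
    Y = np.column_stack([lhs, rhs_endog])   # endogenous variables in the equation
    R1 = residuals(Y, incl_predet)          # purged of included predetermined only
    R = residuals(Y, all_predet)            # purged of all predetermined variables
    W1, W = R1.T @ R1, R.T @ R
    # Smallest root of |W1 - k*W| = 0, the least variance ratio.
    kappa = np.linalg.eigvals(np.linalg.solve(W, W1)).real.min()
    Z = np.column_stack([rhs_endog, incl_predet])
    MZ = residuals(Z, all_predet)
    My = residuals(lhs, all_predet)
    A = Z.T @ Z - kappa * (MZ.T @ MZ)
    c = Z.T @ lhs - kappa * (MZ.T @ My)
    return kappa, np.linalg.solve(A, c)

# Simulated data from a hypothetical two-equation system (no intercepts).
rng = np.random.default_rng(1)
n = 2000
b11, b12 = -0.8, 0.5              # demand: p = b11*q + b12*y + u
b21, b22, b23 = 0.6, 1.2, 0.7     # supply: q = b21*p + b22*w + b23*z + v
w, z, y = rng.normal(size=(3, n))
u, v = 0.3 * rng.normal(size=(2, n))
d = 1.0 - b11 * b21
p = (b11 * b22 * w + b11 * b23 * z + b12 * y + u + b11 * v) / d
q = (b22 * w + b23 * z + b12 * b21 * y + b21 * u + v) / d

kappa, (b11_hat, b12_hat) = liml(p, q, y[:, None], np.column_stack([y, w, z]))
print(kappa, b11_hat, b12_hat)
```

Only the demand equation's own variables and the list of predetermined variables in the system enter the computation; the form of the supply equation is never used, which is the sense in which the information is limited.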
Most of the systems of simultaneous equations that have been fitted and that involve overidentified equations have been based on the single-equation limited information approach. This is the approach that generally is meant when reference is made to the use of the limited information method.

For some applications of the limited information approach, I would like to call your attention to Cowles Commission Monograph 15 by Hildreth and Jarrett 14/ dealing with the livestock economy; Iowa Research Bulletin 410 by Nordin, Judge, and Wahby 15/ dealing with livestock products; and USDA Technical Bulletin 1136 by Meinken 16/ on wheat, to be issued by us this fall. This approach has been used in several other studies, but each of these devotes considerable attention to broad methodological aspects.

Some Computational Considerations

If certain equations in a system of equations involve only a single endogenous variable, these can be fitted by the least squares approach. If other equations are just identified, they can be fitted by the method of reduced forms. If still other equations are overidentified, they can be fitted by the single-equation limited information approach. If the full information approach were to be used, all equations would be fitted simultaneously. But, even if, in essence, each equation is to be fitted separately, similar computations based on the same data frequently are involved for each analysis. Hence considerable computational time may be saved by handling the computations as a unit. Even if several equations can be fitted directly by least squares and no other equations are involved, time frequently is saved by carrying out the computations as a simultaneous operation for the group. Whichever approach is used, the answer obtained is the same; here we are concerned only with computational efficiency. If time permitted, I would give some concrete examples to illustrate this point.
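A small sketch of the point about handling computations as a unit: when several least squares equations share the same independent variables, the sums of squares and cross products need to be formed only once. The data, dimensions, and coefficient values below are invented for illustration and are not taken from any of the studies cited.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50

# Three shared independent variables and two dependent variables (hypothetical).
X = rng.normal(size=(n, 3))
true_coef = np.array([[1.0, -0.5],
                      [0.2,  0.8],
                      [0.0,  1.5]])
Y = X @ true_coef + 0.1 * rng.normal(size=(n, 2))

# One equation at a time: each fit repeats the work on X.
separate = np.column_stack(
    [np.linalg.lstsq(X, Y[:, j], rcond=None)[0] for j in range(Y.shape[1])]
)

# As a unit: form the moment matrices once, then solve for all equations.
XtX = X.T @ X          # sums of squares and cross products, computed once
XtY = X.T @ Y          # cross products with every dependent variable
joint = np.linalg.solve(XtX, XtY)

print(np.max(np.abs(separate - joint)))
```

The two sets of coefficients agree to rounding error, which illustrates that only the amount of computation differs between the two routes, not the answer.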
We have in process a computational handbook in which several such examples are discussed in detail. 17/

14/ Hildreth, Clifford, and Jarrett, Frank. A Statistical Study of Livestock Production and Marketing. Cowles Commission for Research in Economics Monog. 15. 1955.
15/ Nordin, J. A., Judge, George G., and Wahby, Omar. Application of Econometric Procedures to the Demands for Agricultural Products. Iowa Agr. Expt. Sta. Research Bul. 410, July 1954.
16/ Meinken, Kenneth W. The Demand and Price Structure for Wheat. U. S. Dept. Agr. Tech. Bul. 1136. (In press.)
17/ Friedman, Joan and Foote, Richard J. Computational Procedures for Handling Systems of Simultaneous Equations.

I would like to tell you just what is included in this handbook. First we describe a new method for handling ordinary multiple regression analyses. To quote the preface, "This method is believed to be more efficient than those now commonly in use and is easier for beginners to understand than those based essentially on the Doolittle approach, since no back solution is required." Then we consider methods for handling equations that involve 2 or more endogenous variables, using essentially the same procedure for equations that are either just identified or overidentified. We discuss in detail the exact steps required for equations containing specified numbers of endogenous and predetermined variables. Then we consider the previously mentioned examples of complex systems of equations. In each case, we show how to obtain the coefficients and their respective standard errors. Finally, we take up tests that should be made after the analyses are completed, correlation concepts that apply to systems of equations, and the use of systems of equations for analytical purposes or forecasting.
To quote the preface again, "This handbook is designed to provide a complete description of the steps involved in the more common types of problems and to illustrate them in a way that will be clear to research and clerical workers who have an acquaintance only with standard methods for handling single-equation multiple-regression analyses." The handbook should be available late this fall. Carbon copies of it are already in use in our central computing unit.

Formulation of Models

In my opinion, there should be no question in the minds of research analysts as to whether they should use single-equation or simultaneous-equation methods for particular equations or groups of equations, but decisions must be reached regarding the complexity of the model to be used and the degree of aggregation. Klein has a useful suggestion here in connection with his discussion of sector models. 18/ He suggests that a master model for the entire national economy be formulated. As I am sure most of you know, he and staff workers at the University of Michigan have made some progress in formulating models of this type. Models for individual industries, individual commodities, or individual regions can then be formulated and fitted as separate entities in such a way that they can be "grafted" onto the master model. Grafting of this sort is being carried out at the University of Michigan and elsewhere. In our work, for example, we have tended to treat variables that relate to the national economy, such as disposable income, as though they were predetermined. With Klein's approach, (1) the computed value of disposable income for each observation, based on the master model, could be used as a predetermined variable, or (2) disposable income could be treated as an endogenous variable and the predetermined variables on which it depends brought in as predetermined variables in the system for the sector.
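Alternative (1) can be illustrated with a toy simulation. The sketch below is an assumption-laden stand-in, not Klein's master model: "income" depends on two made-up predetermined variables `g` and `i` plus a disturbance, and that disturbance is deliberately correlated with the disturbance in a one-variable sector equation. All names, coefficients, the sample size, and the seed are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5000
a = 0.5                                   # true sector coefficient on income

# Toy "master model": income depends on predetermined g and i plus eta.
g, i = rng.normal(size=(2, n))
eta = rng.normal(size=n)
income = 1.0 * g + 1.0 * i + eta

# Sector equation whose disturbance is correlated with eta, so income is
# not truly predetermined from the sector's point of view.
e = 0.8 * eta + 0.6 * rng.normal(size=n)
q = a * income + e

def slope(x, yv):
    """Least squares slope through the origin (variables have zero mean)."""
    return (x @ yv) / (x @ x)

# Computed value of income from the master model's predetermined variables.
G = np.column_stack([g, i])
coef, *_ = np.linalg.lstsq(G, income, rcond=None)
income_hat = G @ coef

naive = slope(income, q)        # income itself treated as predetermined
grafted = slope(income_hat, q)  # computed income used as predetermined

print(naive, grafted)
```

In this artificial setup the slope based on observed income is pulled away from the true value by the correlated disturbance, while the slope based on the computed value stays close to it.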
Either method yields consistent estimates, whereas the treatment of disposable income, as such, as a predetermined variable results in some bias.

The same general approach could be used in narrower fields. For example, we plan to fit some national aggregate simultaneous equation models for the feed-livestock economy in which all feeds and all livestock and livestock products will be aggregated. The model will be similar to that used by Hildreth and Jarrett but will contain what we believe to be some important modifications. At the same time we will be fitting models that relate to individual feeds and to individual types of livestock, either for the country as a whole or for specified regions. Later, less aggregative models may be fitted for separate regions, with separate equations for each major type of livestock, and these regional models then could be grafted onto the master feed-livestock economy model, which in turn could be grafted onto a master model for the entire economy. Thus, over time, a series of studies could be built up, each of which would be manageable as a research unit, but all of which would tie together to make a united whole.

Concluding Remarks

I regret that time has not permitted the presentation of concrete examples of the formulation and use of simultaneous systems of equations, but the references which I cited earlier contain some excellent examples of these. A complete description of any one of them would have required at least the full time allotted to me today. I hope that my remarks have cleared up some of the concepts and terminology used in this area by modern econometricians. In formulating and fitting complex models, the nonmathematician will do well to seek the advice of a competent expert in this area. Fortunately, our graduate schools are turning out an increasing number of such experts.
However, it is my belief that anyone who is willing to give the matter careful thought can understand the basic concepts involved and can conduct applied research in such a way that he can use these concepts to advantage in his work with only the occasional help of a mathematician.